REPLICATION FACILITY



TECHNICAL FIELD

The present invention relates generally to data processing 
systems and, more particularly, to replication facilities used 
within distributed systems. 

BACKGROUND OF THE INVENTION 

Replication facilities have been provided in a number of
different types of software products. For instance, replication
facilities have been incorporated in database products,
network directory service products, and groupware products.
Many of the conventional replication facilities are
limited in terms of what they can replicate. For instance,
many conventional replicators can only replicate one type of
logical structure (i.e., a file). Furthermore, the conventional
replicators are limited in terms of the quantity of the logical
structures that may be replicated at a time. In particular,
many conventional replicators can only replicate one file at
a time.

SUMMARY OF THE INVENTION 

In accordance with a first aspect of a preferred embodiment
of the present invention, a method is practiced in a
distributed system having a replication facility and a number
of computer systems that each include a storage device. In
this method, a plurality of files are provided and organized
into a tree. A single one of the files is replicated using the
replication facility such that a copy of the file is stored in the
storage device of a different computer system than the
original copy of the file. A subtree of files of multiple levels
is also replicated. The subtree is originally stored on the
storage device of one of the computer systems. Replication
is performed using the replication facility such that a copy
of the subtree and its files is stored in the storage device of
another of the computer systems.

In accordance with another aspect of the present invention,
a first copy of a file is provided in one of the computer
systems. A second copy of the file is provided in another of
the computer systems. The first copy of the file is reconciled
with the second copy of the file using a reconciler facility.
The reconciliation ensures that the second copy of the file
incorporates any changes made to the first copy of the file.
A first copy of a group of files is provided in one of the
computer systems, and a second copy of the group of files
is provided in another of the computer systems. The reconciler
facility is used to reconcile the first copy of the group
of files with the second copy of the group of files so that the
second copy of the group of files incorporates any changes
made to the first copy of the group of files since they were
last reconciled.

In accordance with a further aspect of the present invention,
a first copy of a group of files is stored in the storage
device of a first of the computer systems. A second copy of
the group of files is stored in the storage device of a second
of the computer systems. Changes are made to at least one
of the files in the first copy of the group of files. The changes
are propagated to the second copy of the group of files upon
the occurrence of an event. Additional changes are made to at
least one of the files in the first copy of the group of files, and
these changes are also propagated to the second copy of the
group of files upon the occurrence of another event.
In accordance with yet another aspect of the present
invention, a first copy of a group of files is stored in the
storage device of a first computer system. A second copy
of the group of files is stored in the storage device of a
second computer system. Any changes made to the first copy
of the group of files are incrementally sent to the second
computer system so that the changes may be made to the
second copy of the group of files.

In accordance with an additional aspect of the present
invention, a first set of files that are stored in one of the
storage devices is specified to be replicated. A filter is
specified for determining which files in the first set of files are
to be replicated. The files specified by the filter are replicated
using the replication facility to produce a second set of files.
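
As a rough illustration of this aspect, the sketch below applies a
filter to a first set of files and copies only the selected files
into a second set. The glob-style pattern, the function names
select_for_replication and replicate, and the in-memory dictionary
standing in for a storage device are all assumptions made for
illustration; the specification does not prescribe a filter syntax.

```python
from fnmatch import fnmatchcase


def select_for_replication(first_set, pattern):
    """Return the file names in first_set that the filter selects."""
    return [name for name in first_set if fnmatchcase(name, pattern)]


def replicate(source_store, names):
    """Copy the selected files into a new dictionary (the second set)."""
    return {name: source_store[name] for name in names}


# Hypothetical first set of files held on one storage device.
source_store = {
    "docs/report.txt": b"q1 results",
    "docs/notes.txt": b"todo",
    "bin/tool.exe": b"\x00\x01",
}

selected = select_for_replication(source_store, "docs/*.txt")
second_set = replicate(source_store, selected)
```

Only the files matched by the filter appear in second_set; the
remaining files in the first set are left unreplicated.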

In accordance with a still further aspect of the present 
invention, files having names are stored in the storage 
devices of the computer systems of the distributed system. 
A distributed namespace is provided. The distributed 
namespace comprises a logical organization of the names of 
the stored files. Selected portions of a group of files in the 
namespace are replicated to create new files holding the 
selected portions of the files. 

In accordance with a further aspect of the present invention,
a first copy of a set of files of a given class is stored
in a first computer system. A second copy of the set of files
is stored in a second computer system. The first copy of the
set of files is reconciled with the second copy of the set of
files using a class-specific reconciler that only reconciles
files of the given class. The files may be stored as persistent
objects, which are organized into classes. Objects and
classes will be discussed below.

In accordance with another aspect of the present inven- 
tion, an application program is run on one of the computer 
systems of a distributed system. A request is made within the 
application program to a private replication mechanism to 
replicate a set of files. Each of the files maintains a list of 
processes that are permitted to access the file. The set of files
is replicated using the private replication mechanism to 
produce a new set of files without replicating the list of 
processes that are permitted to access the file. 

In accordance with a further aspect of the present invention,
a first copy of a group of files is provided in a first
computer system and a second copy of the group of files is
provided in a second computer system. Changes are made to
the first copy of the group of files. An agent is provided for
the first copy of the group of files. The agent has access rights
to access and read the files in the first copy of the group of
files. A reconciler is provided at the second computer system
for reconciling the second copy of the group of files with the
first copy of the group of files. A proxy is granted from the
agent of the first copy of the group of files to the reconciler.
The proxy grants the reconciler limited authority to access
and read the files in the first copy of the group of files. The
reconciler then reconciles the second copy of the group of
files with the first copy of the group of files so that the
changes that were made to the first copy of the group of files
are also made to the second copy of the group of files.

In accordance with a final aspect of the present invention,
a method is practiced in a distributed system. In this method,
heterogeneous file systems are provided in the distributed
system. A storage manager is provided for each file system
to manage access to the files held therein. In response to a
request to reconcile a first set of files with a second set of
files, access is granted to the first set of files by the storage
manager for the file system that holds the first set of files and
access is granted to the second set of files by the storage
manager for the file system that holds the second set of files.
The first set of files is reconciled with the second set of files
under the control of the storage managers of the respective
file systems that hold the first set of files and the second set
of files.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1A is a block diagram of a distributed system
suitable for practicing a preferred embodiment of the present
invention.

FIG. 1B is a diagram of a distributed namespace for a
distributed system in accordance with the preferred embodiment
of the present invention.

FIG. 2 is a block diagram of a change log used in the
preferred embodiment of the present invention.

FIG. 3 is a block diagram of a replication information
block (RIB) used in the preferred embodiment of the present
invention.

FIG. 4 is a block diagram illustrating the functional
components of the replication facility used in the preferred
embodiment of the present invention.

FIG. 5 is a diagram illustrating the interaction of elements
that play a role in public replication in the preferred embodiment
of the present invention.

FIG. 6 is a flowchart of the steps performed in replication
in the preferred embodiment of the present invention.

FIG. 7 is a flowchart illustrating the steps performed to
provide security during replication in the preferred embodiment
of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

A preferred embodiment of the present invention provides
a replication facility for use in a distributed environment.
The replication facility supports weakly consistent replication
of any subtree of persistent objects in the distributed
namespace of the system. The replication facility may
replicate single objects or may replicate logical structures
that include multiple objects. The replication facility reconciles
local copies of objects with remote copies of objects.
Reconciliation occurs on a pair-wise basis such that each
object in a local set of objects is reconciled with its
corresponding object in the remote set of objects. The
reconciliation may occur over heterogeneous file systems.

FIG. 1A depicts a distributed system 10 that is suitable for
practicing the preferred embodiment of the present invention.
The distributed system 10 includes an interconnection
mechanism 12, such as a local area network (LAN), wide
area network (WAN), or other interconnection mechanism,
that interconnects a number of different data processing
resources. The data processing resources include workstations
14, 16, 18 and 20, printers 22 and 24, and secondary
storage devices 26 and 28. Each of the workstations 14, 16,
18 and 20 includes a respective memory 30, 32, 34 and 36.
Each of the memories 30, 32, 34 and 36 holds a copy of a
distributed operating system 38. Each workstation 14, 16, 18
and 20 may implement a separate file system.

Those skilled in the art will appreciate that the present
invention may be practiced on configurations other than the
configuration shown in FIG. 1A. The distributed system 10
shown in FIG. 1A is intended to be merely illustrative and
not limiting of the present invention. For instance, the
interconnection mechanism 12 may interconnect a number
of networks that are running separate network
operating systems.

The preferred embodiment of the present invention allows 
users and system administrators to replicate persistent 
"objects". An object, in this context, is a logical structure 
that holds at least one data field. Groups of objects with 
similar properties and common semantics are organized into 
object classes. A number of different object classes may be 
defined for the distributed system 10. Although the preferred 
embodiment of the present invention employs objects, those 
skilled in the art will appreciate that the present invention is 
not limited to an object-oriented environment; rather, the 
present invention may also be practiced in non-object- 
oriented environments. The present invention is not limited 
to replication of objects; rather, it is more generalized to 
support the replication of logical structures, such as files or 
file directories. 

The operating system 38 includes a file system for storing
the objects that are used in the preferred embodiment of the
present invention. The objects are organized into a distributed
namespace 19 (FIG. 1B). The distributed namespace 19
is a logical tree-like structure formed from the object names
21 stored in the file system of the operating system 38. The
distributed namespace 19 illustrates the hierarchy among the
named objects of the system 10 (FIG. 1A).

The replication facility of the preferred embodiment of
the present invention provides not only for the duplication of
objects so that objects may be distributed across the distributed
system, but also provides for reconciliation of multiple
copies of objects (i.e., multimaster replication). Reconciliation
refers to reconciling an object with a changed object so
that the object reflects the changes made to the changed
object. For instance, suppose that a remote copy of an object
has been changed and a local copy of the object has not yet
been updated to reflect the changes. Each object not only has
contents but also has a name and location within the
distributed file system. Reconciliation involves reconciling the
two copies of the object such that the local copy of the object
is changed in a like fashion to how the remote copy of the
object was changed. The term "replication," as used herein,
refers to not only duplicating objects so that multiple copies
of the objects are distributed across the distributed system
10, but also refers to reconciliation of the copies of the
objects.

Before discussing the preferred embodiment of the
present invention in more detail below, it is helpful to
introduce a few key concepts that will be referenced below.
An "object set" is a collection of objects that are grouped
together for replication. An object set may include a single
object or a sub-tree of objects. The object set is specified by
the user or administrator who requests replication. A "replica
set," in contrast, is a collection of systems which each own
a local copy of an object set, and a "replica" is a member of
a replica set.

To insulate the replication facility from the underlying
physical storage system (e.g., the type of file system
employed to store objects) and to provide extensibility, the
preferred embodiment of the present invention adopts the
abstraction of a replicated object store (ReplStore). The
ReplStore abstraction allows the replication facility to be
applied across heterogeneous file systems. The ReplStore
presents a group of interfaces that must be supported for an
underlying physical storage system to support replication
facilities. In particular, only those objects that reside in
object stores that support the ReplStore interfaces can be
replicated. An interface is a named group of logically related
functions. The interface specifies signatures (such as
parameters) for the group of related functions provided by an
interface. The interface does not provide code for implementing
the functions; rather, the code for implementing the
functions is provided by objects or by other implementations.
Objects that provide the code for an instance of an interface
are said to "support" the interface. The code provided by an
object that supports an interface must comply with the
signature specified within the interface. Thus, in the example
described above, the object store that stores the objects in the
object set must support the ReplStore interfaces in order for
the object set to be replicated. Implementations of the
ReplStore interfaces are provided for each of the file systems
within the distributed system 10 in order to support replication
over each of the file systems.

Each ReplStore provides a mechanism for identifying
replicated objects on the local volume. This mechanism is
the replicated object ID (ROBID). The ROBID is an abstraction
that encapsulates the identity as well as other information
about an object that is being replicated. The ReplStore
supports routines for serializing and deserializing ROBIDs.
The ROBID of an object provides a mechanism for performing
numerous operations. For instance, an object can be
retrieved from storage using information contained in the
ROBID. Further, a component name of an object can be
derived from its ROBID.

Each ReplStore maintains a replicated storage change log
40 (FIG. 2). The change log 40 includes a number of change
items 42 that specify changes that have been made to objects
in the object set. Each change item 42 includes a type field
44, a serialized ROBID field 46 for the object that is
changed, a time field 48 indicating the time that the change
occurred (local time) and a replication information block
(RIB) field 50 holding a RIB that is associated with the
change. In the embodiment described herein, there are five
types of changes that may be specified within the type field
44. These changes are deletion, creation, modification,
renaming, and moving. A deletion occurs when an object is
deleted. Creation occurs when the object is created. A
modification occurs when the contents of the object are
modified in some way. A renaming occurs when the component
name of the object is modified and moving occurs
when the object is moved under a new parent in the
distributed namespace of the system.

A cursor 49 is maintained within the change log 40 that
acts as an index into the list of change items 42. The cursor
49 acts as a marker in the list of change items 42. In addition,
a change log may include multiple cursors. The cursor 49
may take the form of a time stamp. The cursor 49 may, for
example, identify the beginning of changes that have
occurred after a point in time.

Every object in an object set that is being replicated is
stamped with an RIB 51 (FIG. 3). The RIB 51 has three
fields: an originator field 53, a change identifier field 55, and
a propagator field 57. The originator field 53 specifies where
the last change to the object occurred. The change identifier
field 55, in contrast, identifies the last change to the object
relative to the originator identified within the originator field
53. Lastly, the propagator field 57 specifies the identity of
the party who sent the change to the local site. When an
object is changed locally, the RIB 51 associated with the
object is modified to reflect the local site as the originator
and the propagator. The change identifier is stamped
appropriately.

Replication is useful for the distributed system 10 in that
it provides load balancing and availability. Replication provides
load balancing by having more than one copy of an
object stored across the distributed system 10 to limit the
load on any one copy of the object. Replication enhances
availability by allowing multiple copies of important objects
to be distributed across the system 10. The enhanced availability
increases the fault resilience of the system. Specifically,
by having copies of important objects distributed
across the system 10, users are less affected by failures
within the system that prevent or limit access to objects. The
enhanced availability also enhances the performance of the
system.

The preferred embodiment of the present invention is
embodied in a replication facility 54 (FIG. 4) that is part of
the operating system 38. Nevertheless, those skilled in the
art will appreciate that the replication facility of the present
invention may also be implemented in other environments,
including graphical user interfaces. As shown in FIG. 4, the
replication facility 54 includes three primary functional
components: a copying component 56, a reconciler component
58 and a control component 60. The replication facility
54 uses the copying component 56 for duplication. In
addition, the replication facility 54 reconciles copies of
object sets using the reconciler component 58 to ensure that
they are consistent with each other. This reconciliation
ensures a consistent view of the objects across the distributed
system 10.

One level of control exerted by the control component 60
concerns how replication is invoked. Replication may be
invoked manually or automatically. Manual invocation
requires that an explicit request to replicate be made by a
user or other party. The user or other party must specify the
object set and the destination for replication. The destinations
are not specified for each replication cycle; rather a
replica connection is specified initially. The replica connection
identifies the two replicas and the object set that are to
be involved in replication. In contrast, automatic invocation
occurs when replication is triggered by certain events 67
(see FIG. 5) or by the passage of a certain amount of time
(which may be construed as a type of event). Replication
may be prescheduled to occur at fixed time intervals.
Another aspect of control exerted by the control mechanism
concerns who may invoke replication. Replication may be
invoked by an appropriately privileged party.

The preferred embodiment of the present invention provides
two types of replication: public replication and private
replication. Public replication refers to a process that may be
performed only by appropriately privileged parties to produce
a "public" copy of an object set. In public replication,
each of the copies of the object set that are produced
cooperates with the other copies to maintain consistency.
The nodes in the namespace that store the public copies, in
aggregate, form a public replica set, and the members of the
set keep state information to maintain consistency among
the copies. Access restrictions on the objects are preserved.
Changes that occur in a public copy of an object set are
reconciled with other public copies.

Private replication refers to a process for producing
private copies of an object set. A private copy may be created
by any party, including a non-administrator. Not all members
of the replicated sets keep state information to maintain
consistency among copies. Private replication will be
discussed in more detail below.

A number of elements play a role in the replication
process in the preferred embodiment of the present invention.
FIG. 5 is a diagram illustrating the elements that may
play a major role in the public replication process. Object
replication agents (ORAs) 62 and 64 are replicator objects
that act as agents on behalf of nodes in which object sets are
stored to provide automatic support for replication. Each
machine in the distributed system has its own ORA. The
ORAs 62 and 64 may act as remote procedure call (RPC)
servers that service requests made on behalf of remote
clients or may alternatively be other types of reliable
communication mechanisms that serve a similar role. A separate
ORA 62 is provided for a local object and another ORA 64
is provided for the corresponding remote object in the public
replication process. Local ORA 62 is responsible for loading
a ReplStore DLL 66 and a ReplStore Manager DLL 65. The
ReplStore Manager 65 is responsible for regulating access to
the ReplStore 66. Clients call the ReplStore Manager 65 to
load the appropriate ReplStore 66 for a given physical
storage system. The ORAs 62 and 64 have a level of
privilege that allows them to read and write all objects that
are being replicated from a local object store. The ORAs 62
and 64 are responsible for replying to requests to exchange
changes with other ORAs which maintain public replicas.

A reconciler 68 also plays a role in the public replication
process. It acts as a counterpart to the local ORA 62 to
reconcile the local object set with the corresponding remote
object set. The reconciler 68 is called by the local ORA 62
and is responsible for opening objects that are to be reconciled.
Two types of reconciler objects may be called by the
reconciler 68. Specifically, a class-specific reconciler 70
may be called or a default (i.e., class-independent) reconciler
72 may be called. The class-specific reconciler 70 reconciles
objects that have class-specific requirements on replication.
The class-specific reconciler 70 is applied to only a class of
objects. The class-independent reconciler 72 reconciles
objects regardless of their class. Multiple class-independent
reconcilers may be available in the system 10. For instance,
each object set may have its own class-independent reconciler.
Every replica set may be associated with its own
class-independent reconciler, which is invoked whenever a
class-specific reconciler is unavailable. Lastly, as mentioned
above, events 67 may play a role in triggering replication.
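
The choice between a class-specific reconciler and the default
reconciler can be pictured as a lookup with a fallback. The sketch
below is illustrative only: the dictionary-based objects, the
"class" key, the register method, and the append-only log class
are hypothetical, and the merge policy shown for the class-specific
case is just one example of a class-specific requirement.

```python
class DefaultReconciler:
    """Class-independent reconciler: takes the remote contents as-is."""

    def reconcile(self, local, remote):
        local["contents"] = remote["contents"]
        return local


class AppendOnlyLogReconciler:
    """Hypothetical class-specific reconciler for append-only log objects."""

    def reconcile(self, local, remote):
        # Keep local entries and add remote entries not yet seen locally.
        merged = list(local["contents"])
        merged += [entry for entry in remote["contents"] if entry not in merged]
        local["contents"] = merged
        return local


class Reconciler:
    """Dispatches each object pair to a class-specific or default reconciler."""

    def __init__(self, default):
        self._default = default
        self._by_class = {}

    def register(self, object_class, reconciler):
        self._by_class[object_class] = reconciler

    def reconcile(self, local, remote):
        # Fall back to the default when no class-specific reconciler exists.
        chosen = self._by_class.get(remote["class"], self._default)
        return chosen.reconcile(local, remote)
```

Registering AppendOnlyLogReconciler for a class named "log" would
route objects of that class to it, while objects of every other
class would be handled by DefaultReconciler.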

FIG. 6 is a flowchart of the steps performed for replication
in the preferred embodiment of the present invention. Initially,
access is gained to a change log 40 (FIG. 2) for a
remote object set (step 74 in FIG. 6). In particular, when a
local object set is to be reconciled with a remote object set,
the local ORA 62 (FIG. 5) contacts the remote ORA 64 via
a remote procedure call mechanism to gain access to the
change log 40. A cursor 49 (FIG. 2) is then created in the
change log (step 76 in FIG. 6). Specifically, the local ORA 62
stores a time stamp indicating the time of the last reconciliation
between the object sets and then passes this time stamp to
the remote ORA 64 to be used as a cursor 49. The remote
ORA 64 then passes this time stamp as a cursor into the
remote change log 40. The cursor identifies items in the
change log that have time stamps after the last reconciliation
and, thus, are of interest for this replication cycle.

A list of change items is then obtained from the remote
change log utilizing the cursor to identify the change items
for changes that have occurred after the last reconciliation.
The remote ORA 64 screens the RIBs 51 of each
of the change items 42 to ensure that the remote ORA does
not pass back to the local ORA 62 changes that originated at
the local ORA (i.e., the remote ORA examines the originator
field 53 of the RIBs) and examines the RIBs to ensure that
change items for changes that were propagated from the
local ORA (i.e., the remote ORA examines the propagator
field 57 of the RIBs) are not sent. The resulting change items
are passed back to the local ORA 62, where they are stored
persistently. The local ORA 62 then uses the reconciler 68 to
perform namespace reconciliation (step 80) and content
reconciliation (step 82) on the objects identified by the
ROBIDs in the change items. In particular, the reconciler 68
reconciles each object that has changed in the remote object
set with the corresponding object of the local object set. Any
changes that have been made to the remote object are made
to the corresponding local object. Whether the class-specific
reconciler 70 or the class-independent reconciler 72 is used
depends upon the source (i.e., remote copy of an object). A
class-specific reconciler 70 is used only if the remote copy
of the object requires such a reconciler.
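
The cursor lookup and RIB screening described above can be
sketched as follows. The field layout follows the change item and
RIB descriptions in this specification, but the Python class names,
the string site identifiers, the floating-point time stamps, and
the changes_for_partner helper are assumptions made for
illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RIB:
    originator: str  # site where the last change to the object occurred
    change_id: int   # identifies the change relative to the originator
    propagator: str  # site that sent the change to this site


@dataclass(frozen=True)
class ChangeItem:
    change_type: str  # "create", "delete", "modify", "rename" or "move"
    robid: str        # serialized ROBID of the changed object
    time: float       # local time at which the change occurred
    rib: RIB


class ChangeLog:
    def __init__(self):
        self._items = []

    def append(self, item):
        self._items.append(item)

    def items_after(self, cursor):
        """Return change items stamped later than the cursor (a time stamp)."""
        return [item for item in self._items if item.time > cursor]


def changes_for_partner(log, cursor, partner):
    """Select changes after the cursor, screening out any change that
    originated at the partner or was propagated from the partner."""
    return [
        item
        for item in log.items_after(cursor)
        if item.rib.originator != partner and item.rib.propagator != partner
    ]
```

In this sketch the remote ORA would call changes_for_partner with
the time stamp supplied by the local ORA as the cursor and the
local site's identifier as the partner, and return the result for
reconciliation.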

Namespace reconciliation is performed (see step 80 in
FIG. 6) for any change recorded in a change item that is not
strictly a content modification or that is not associated with
a system property. Such changes include creations, deletions,
moves, and renames. Namespace reconciliation occurs
by comparing information obtainable from the ROBIDs of local
objects with information stored for the corresponding
remote objects. Many different ways of resolving name
conflicts may be used within the present invention.
The preferred embodiment of the present invention,
however, adopts two rules. A first rule used by the preferred
embodiment of the present invention to resolve namespace
conflicts is to select a last modification over a previous
modification. When an object is moved or renamed at one site
to have a first name, and the same object is moved or
renamed at another site to have a different name, the last
occurring change is chosen so that the object assumes the
name associated with the last change. A second rule is used
to resolve namespace collisions. A namespace collision
occurs when two different objects are created, moved, or
renamed to have the same name. The second rule specifies
that whichever object was created, moved, or renamed first
is the object that is given the name at the local site.
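
The two rules can be expressed compactly. The sketch below is a
minimal illustration under stated assumptions: the NameChange
record, its object_id field, and the tie-breaking in favor of the
local change are hypothetical, and the handling of the object that
loses a collision is not addressed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameChange:
    object_id: str  # hypothetical stable identity of the object
    name: str       # name assigned by the create, move, or rename
    time: float     # time at which the change occurred


def resolve_conflict(local, remote):
    """First rule: the same object received two names; the later change wins."""
    if local.object_id != remote.object_id:
        raise ValueError("a conflict involves a single object")
    return remote if remote.time > local.time else local


def resolve_collision(local, remote):
    """Second rule: two objects claim one name; the earlier change keeps it."""
    if local.object_id == remote.object_id or local.name != remote.name:
        raise ValueError("a collision involves two objects with one name")
    return remote if remote.time < local.time else local
```

For example, if an object is renamed locally at time 5 and remotely
at time 9, resolve_conflict returns the remote change, so the
object takes the remote name. If two different objects are given
the same name at times 5 and 3, resolve_collision returns the
change made at time 3.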

Content reconciliation (see step 82 in FIG. 6) involves
reconciling contents of a local object with a remote object so
that the local object includes the modifications made to the
remote object. By examining the changes in the change log,
the local objects may be changed to have the same contents
as the remote objects.

During replication, changes are propagated from one
replica to another. Replication is "one way" in that the
changes made to an initial copy of an object set are made to
a second copy of the object set. There is no immediate
reciprocal action to copy the changes made to the second
copy of the object set to the first copy of the object set.
Nevertheless, such propagation to the first copy of the object
set may be performed. Given this one-way nature of replication,
each replica monitors how up to date its local copy of
an object set is. To that end, each replica maintains cursors
into its partners' change logs. At the completion of each exchange
during reconciliation, the two replicas exchange cursor
information.

Public replication poses a number of security issues. In
general, reconcilers must be able to update objects in order
to perform replication. The class-independent reconciler is a
trusted system process and, thus, does not pose a security
risk. Class-specific reconcilers, however, are not trusted
system processes, and thus pose a security threat. To help
alleviate this security dilemma, the preferred embodiment of
the present invention utilizes "proxies".

A proxy is a delegation ticket that allows worker processes
or remote processes to perform well-defined operations
without having extraordinary privileges. The proxy
packages credentials of the granting party and lends them to
the parties seeking access to remote objects. The party
seeking access may then step into the shoes of the granting
party and access the necessary objects. These credentials
may be encrypted. FIG. 7 is a flowchart of the steps
performed to utilize a proxy in the preferred embodiment of
the present invention. During the replication process, the
remote ORA 64 (FIG. 5) gives a local reconciler 68 a proxy
(step 84 in FIG. 7). As mentioned above, this proxy includes
the appropriate credentials and access rights that are to be
granted by the remote ORA to the local reconciler. The
reconciler 68 then sends the credentials to the remote site
(step 86 in FIG. 7). In other words, the reconciler 68 presents
the proxy to the remote site. The remote site then validates
the credentials and, if the credentials are valid, grants limited
access to the objects within the remote copy of the object set
in question (step 88). The reconciler 68 then gains access to
the remote objects in the object set (step 90). The local
reconciler's range of access, however, is limited to only that 
which is necessary to perform proper reconciliation. It 
should be appreciated that the present invention is not
limited to exclusively using proxies. Any technique that 
grants secure access, such as making each ORA a member 
of a common access group that grants access rights, is 
permissible. 
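
The grant, present, and validate sequence of FIG. 7 can be
approximated with a signed, scoped token. This is a sketch under
stated assumptions and not the mechanism of the preferred
embodiment: the specification says only that the proxy packages
credentials, which may be encrypted, so the HMAC signature, the
RemoteSite class, the dictionary-shaped proxy, and the
comma-separated scope string are all substitutions made for
illustration.

```python
import hashlib
import hmac


class RemoteSite:
    """Stands in for the remote ORA and the site that validates proxies."""

    def __init__(self, secret, objects):
        self._secret = secret    # key known only to the remote site
        self._objects = objects  # object name -> contents

    def _sign(self, grantee, scope):
        message = f"{grantee}|{scope}".encode()
        return hmac.new(self._secret, message, hashlib.sha256).hexdigest()

    def grant_proxy(self, grantee, allowed):
        """Issue a proxy limiting the grantee to reading the named objects."""
        scope = ",".join(sorted(allowed))
        return {"grantee": grantee, "scope": scope,
                "tag": self._sign(grantee, scope)}

    def read(self, proxy, name):
        """Validate a presented proxy, then return the object if in scope."""
        expected = self._sign(proxy["grantee"], proxy["scope"])
        if not hmac.compare_digest(expected, proxy["tag"]):
            raise PermissionError("invalid proxy")
        if name not in proxy["scope"].split(","):
            raise PermissionError("object outside proxy scope")
        return self._objects[name]
```

A reconciler holding a proxy scoped to one object can read that
object and nothing else, and a proxy whose scope has been altered
after it was granted fails validation. Object names containing a
comma or a vertical bar would need escaping in a fuller version.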

Most of the above discussion has focused on public 25 
replication. Private replication is similar to public replica- 
tion but includes a number of differences. In private repli- 
cation, the source of changes does not maintain a record of 
what objects were duplicated or changed. There is no state 
information maintained at the source. The source is not 
responsible for advising that changes have occurred. 
Accordingly, the resources that are required for public 
replication are not required. These characteristics make 
private replication especially appropriate for instances 
where manual control of replication is desired, or instances 
wherein the cost of maintaining a public copy of an object 
set is not warranted. 
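
The characteristics of private replication described above can be illustrated with a minimal sketch. The names `PrivateSource` and `private_replicate` are hypothetical, and comparing a full snapshot against the replica is an assumption made for illustration, not the mechanism of the preferred embodiment. The point shown is that the source keeps no record of what was duplicated or changed, so the party holding the replica must trigger replication and work out the differences itself.

```python
class PrivateSource:
    """Hypothetical source of a privately replicated object set.

    It holds only the objects themselves: no change log, no list of
    replicas, and no duty to advise anyone that a change occurred.
    """

    def __init__(self, objects):
        self.objects = dict(objects)

    def update(self, name, value):
        # Nothing is recorded about the change beyond the new value.
        self.objects[name] = value

    def snapshot(self):
        return dict(self.objects)


def private_replicate(source, replica):
    """Manually triggered replication into a plain dict replica.

    Because the source keeps no state, every object is compared against
    the replica on each run. Deletions are ignored in this sketch.
    Returns the sorted names of the objects that were copied.
    """
    current = source.snapshot()
    changed = [name for name, value in current.items()
               if replica.get(name) != value]
    for name in changed:
        replica[name] = current[name]
    return sorted(changed)
```

Each call to `private_replicate` in this sketch does work proportional to the size of the object set, which is the price paid for keeping no state at the source.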

While the present invention has been described with 
reference to a preferred embodiment thereof, those skilled in 
the art will appreciate that various changes in form and 
detail may be made without departing from the scope of the 
present invention as defined in the appended claims. For 
example, the present invention need not be implemented in 
an object-oriented environment and need not be practiced 
solely in a distributed system configuration like that shown 
in FIG. 1A. Furthermore, communication mechanisms other 
than RPC mechanisms may be used for remote interactions, 
and security mechanisms other than proxies may be 
employed. 

We claim: 

1. In a distributed system having a replication facility and 
a number of computer systems that each include a storage 
device, a method comprising the steps of: 

providing a plurality of files organized into a tree of files; 
replicating a single one of the files that is stored in the 
storage device of one of the computer systems using the 
replication facility so that a copy of the file is stored in 
the storage device of another of the computer systems; 
and 
replicating a subtree of files of multiple levels, from the 
tree of files, that is stored in the storage device of one 
of the computer systems using the replication facility so 
that a copy of the subtree of files is stored in the storage 
device of another of the computer systems. 

2. The method of claim 1, further comprising the step of 
replicating the single file using the replication facility so that 
a copy of the file is stored in the storage device of a third of 
the computer systems in the distributed system. 

3. The method of claim 1, further comprising the step of 
replicating the subtree using the replication facility so that a 
copy of the subtree is stored in the storage device of a third 
of the computer systems in the distributed system. 

4. The method of claim 3 wherein the subtree being 
replicated includes at least three levels of files. 

5. A distributed system comprising: 

a plurality of computer systems, each computer system 
including a storage device for storing files; 

a namespace manager for managing a namespace of the 
system a tree structure of names of the files; and 

a replication facility for replicating a subtree of the 
namespace that includes multiple levels. 

6. In a distributed system having a reconciler facility and 
a number of computer systems, a method comprising the 
steps of: 

providing a first copy of a file in one of the computer 
systems and a second copy of the file in another of the 
computer systems; 

reconciling the first copy of the file with the second copy 
of the file using the reconciler facility so that the second 
copy of the file incorporates any changes made to the 
first copy of the file since last reconciled; 

providing a first copy of a group of files in one of the 
computer systems and a second copy of the group of 
files in another of the computer systems; and 

reconciling the first copy of the group of files with the 
second copy of the group of files using the reconciler 
facility so that the second copy of the group of files 
incorporates any changes made to the first copy of the 
group of files since last reconciled. 

7. The method of claim 6 wherein the step of reconciling 
the first copy of the group of files with the second copy of 
the group of files further comprises the step of reconciling on 
a pair by pair basis each file in the first copy of the group of 
files with a corresponding file in the second copy of the 
group of files. 

8. In a distributed system having a replication facility and 
a number of computer systems, each including a storage 
device, a method comprising the steps of: 

providing a first copy of a group of files stored in the 
storage device of a first of the computer systems; 

providing a second copy of the group of files stored in the 
storage device of a second of the computer systems; 

making changes to files in the first copy of the group of 
files; 

propagating the changes to the second copy of the group 
of files upon occurrence of an event; 

making additional changes to files in the first copy of the 
group of files; and 

propagating the additional changes to the second copy of 
the group of files upon occurrence of another event. 

9. The method recited in claim 8 wherein the event is the 
elapsing of a predetermined time period. 

10. The method recited in claim 9 where the other event 
is also the elapsing of a predetermined time period. 

11. The method of claim 8 wherein the event is a request 
by the second computer system to receive the changes. 

12. The method of claim 11 wherein the other event is a 
request by the second computer system to receive the 
additional changes. 

13. The method recited in claim 8, further comprising the 
step of reconciling the second copy of the group of files with 
the first copy of the group of files so that the second copy of 
the group of files incorporates the changes made to the first 
copy of the group of files. 

14. The method recited in claim 13, further comprising the 
step of reconciling the second copy of the group of files with 
the first copy of the group of files so that the second copy of 
the group of files incorporates the additional changes made 
to the first copy of the group of files. 

15. In a distributed system having a replication facility 
and computer systems that each include a storage device, a 
method comprising the steps of: 

storing files, having names, in the storage devices of the 
computer systems; 

providing a distributed namespace comprising a logical 
organization of the names of the stored files; and 

replicating selected portions of a group of files stored in 
the storage devices of one of the computer systems and 
whose names form a part of the distributed namespace 
using the replication facility to create new files holding 
the selected portions of the files. 

16. The method recited in claim 15, further comprising the 
step of replicating the new files to distribute the new files 
across at least a portion of the computer systems of the 
distributed system. 

17. In a distributed system having a first computer system 
and a second computer system, a method comprising the 
steps of: 

providing a first copy of a set of files of a given class that 
are stored in the first computer system; 

providing a second copy of the set of files of the given 
class that are stored in the second computer system; 

reconciling the first copy of the set of files with the second 
copy of the set of files using a class-specific reconciler 
that only reconciles files of the given class. 

18. The method recited in claim 17, further comprising the 
steps of: 

making changes to the first copy of the set of files; 

reconciling the first copy of the set of files with the second 
copy of the set of files using a class-independent 
reconciler that reconciles files regardless of class. 

19. In a distributed system having a private replication 
mechanism and computer systems for running processes that 
each include a storage device, a method comprising the steps 
of: 

running an application program on one of the computer 
systems; 

making a request to the private replication mechanism to 
replicate a set of files within the application program, 
each of the files maintaining a list of processes that are 
permitted to access the file; and 

replicating the set of files using the private replication 
mechanism to produce a new set of files without 
replicating, for each file, the list of processes that are 
permitted to access the file. 

20. In a distributed system having a first computer system 
and a second computer system, a method comprising the 
steps of: 

providing a collection of files at the first computer system; 

in response to a request to replicate the collection of files 
to the second computer system, determining whether 
all or none of the files in the collection should be 
replicated; 

where it is determined that all of the files in the collection 
should be replicated, replicating all of the files in the 
collection so that a replica of the collection is provided 
at the second computer system; and 

where it is determined that none of the files in the 
collection should be replicated, replicating none of the 
files in the collection. 

21. In a distributed system having a first computer system 
and a second computer system, a method comprising the 
steps of: 

providing a first copy of a group of files in the first 
computer system; 

providing a second copy of the group of files in the second 
computer system; 

making changes to the first copy of the group of files; 

providing an agent for the first copy of the group of files, 
wherein each agent has access rights to access and read 
the files in the first copy of the group of files; 

providing a reconciler at the second computer system for 
reconciling the second copy of the group of files with 
the first copy of the group of files; 

granting a proxy to the reconciler from the agent of the 
first copy of the group of files, said proxy granting the 
reconciler limited authority to access and read the files 
in the first copy of the group of files; and 

reconciling the second copy of the group of files with the 
first copy of the group of files using the reconciler so 
that the changes made to the first copy of the group of 
files are made to the second copy of the group of files. 

22. In a distributed system, a method comprising: 

providing heterogeneous file systems in the distributed 
system; 

providing a storage manager for each file system to 
manage access to files in the file system; 

in response to a request to reconcile a first set of files with 
a second set of files, granting access to the first set of 
files by the storage manager for the file system that 
holds the first set of files and granting access to the 
second set of files by the storage manager for the file 
system that holds the second set of files; and 

reconciling the first set of files with the second set of files 
under control of the storage managers of the respective 
file systems holding the first set of files and the second 
set of files. 

23. The method of claim 22 wherein each copy of a file 
stored in the file systems is provided a storage-specific 
identifier by the storage manager. 

24. The method of claim 22 wherein each storage man- 
ager reports changes to the files in its file system. 

25. The method of claim 24 wherein the changes include 
deletions of files. 

26. The method of claim 24 wherein the changes include 
renaming of files. 

27. The method of claim 24 wherein the changes include 
moving of files in the distributed system. 

28. The method of claim 24 wherein the changes are 
reported to a change log and wherein the step of reconciling 
is performed using the change log. 

29. The method of claim 22 wherein each copy of a file 
is assigned a unique identifier and wherein the step of 
reconciling includes comparing identifiers to determine 
which files are to be reconciled. 

***** 